DETAILED ACTION
Continued Examination Under 37 CFR 1.114 

1. A request for continued examination under 37 CFR 1.114, including the fee set
forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this
application is eligible for continued examination under 37 CFR 1.114, and the fee set
forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action
has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 
01/05/2009 has been entered. 



Response to Arguments 

2. Applicant's arguments filed 01/05/2009 have been fully considered but they are
not persuasive. Examiner has addressed the newly amended claim limitations relative
to claims 1, 13, and 14, and maintains the previously cited art of Shizuka and King with
respect to said amended limitations. Examiner maintains that the combined teachings of
Shizuka and King are both within the scope of the present invention relative to speech
output through user interaction, wherein Shizuka in view of King clearly teaches and
suggests the use of a device that verbally outputs information to aid, for example, a
visually impaired individual, as is explicitly taught by King and is directly within the scope
of the present invention (present invention spec, pages 2, 3, and 17). Further, the use
of "motion" is merely construed as the operation a user is performing, wherein King
clearly demonstrates the description of an executed step by a user, i.e., "motion" (though
the term motion is not used). This is also explicitly taught by King and is
directly parallel to the teaching of the present invention regarding motion (present
invention spec, page 10, motion phonetically output after execution of a button being
pressed). Examiner maintains arguments as previously cited:

King teaches an assistive technology application 212 that produces speech
information corresponding to the screen image information. In the embodiment of FIG.
2, the speech information conveys human speech which verbally describes general
attributes (e.g., color, shape, size, and the like) of the screen image and any objects
(e.g., menus, dialog boxes, icons, text, and the like) within the screen image, and also
includes semantic information conveying the meaning, significance, or intended purpose
of each of the objects within the screen image. The speech information may include, for
example, text-to-speech (TTS) commands and/or audio output signals. Suitable
assistive technology applications are known and commercially available. The
assistive technology application 212 provides the speech information to a speech
application program interface (API) 214. The speech application program interface
(API) 214 provides a standard means of accessing routines and services within an
operating system of the server 102 (King Col. 5 lines 45-65 & Fig. 2).

Further, King teaches that the console access application 202 of the client 104A
are configured to cooperate such that the user of the client 104A is able to interact with
the server 102 as if the user were operating the server 102 locally. As shown in FIG. 2,
the client 104A includes an input device 220. The input device 220 may be for example,
a keyboard, a mouse, or a voice recognition system. When the user of the client 104A
activates the input device 220 (e.g., presses a keyboard key, moves a mouse, or
activates a mouse button), the input device 220 produces one or more input signals
(i.e., "input signals"), and provides the input signals to the distributed console access
application 202. The distributed console access application 202 transmits the input
signals to the distributed console access application 200 of the server 102. (King Col. 6
lines 41-56).

Furthermore, King teaches that when the user of the client 104A is visually
impaired, the user may not be able to see the screen image displayed on the display
screen 210 of the client 104A. However, when the audio output device 230 produces
the verbal description of the screen image, the visually-impaired user may hear the
description, and understand not only the general appearance of the screen image and
any objects within the screen image (e.g., color, shape, size, and the like), but also the
meaning, significance, or intended purpose of any objects within the screen image as
well (e.g., menus, dialog boxes, icons, and the like). This ability for a visually-impaired
user to hear the verbal description of the screen image and to know the meaning,
significance, or intended purpose of any objects within the screen image allows the user
of the client 104A to interact with the objects in a proper, meaningful, and expected way.
(King Col. 7 lines 49-64).

Claim Rejections - 35 USC § 103

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the
invention was made to a person having ordinary skill in the art to which said subject matter pertains.
Patentability shall not be negatived by the manner in which the invention was made.

4. Claims 1, 6-8, and 13-15 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Shizuka et al. US 20020184004 A1 (hereinafter Shizuka) in view of
King et al. US 7103551 B2 (hereinafter King).

Re claims 1 and 13-15, Shizuka teaches a control method for an apparatus that
includes a help button, an execution button and a third button, wherein the help button
is for setting a help mode or setting a normal mode by canceling the help mode as a
state of the apparatus ([0231], user selects from an OK button, cancel button, help
button, new voice button, etc.), the method comprising:

a state determination step of determining whether the state of the apparatus
is the normal mode or the help mode;

a button determination step of determining which of the help button, the
execution button and the third button is selected ([0231], user selects from an OK button,
cancel button, help button, new voice button, etc.);

a setting step of setting the state of the apparatus in the help mode, in a case
where it is determined in said state determination step that the state of the apparatus is
the normal mode and it is determined in said button determination step that the help
button is selected ([0240], a help window will be present when in help mode);

a first execution step of executing a motion corresponding to a button determined
in said button determination step ([0240], a help window will be present when in help
mode, Fig. 24 shows normal mode when no help window is present), in a case where it
is determined in said state determination step that the state of the apparatus is the
normal mode and it is determined in said button determination step that the execution
button or the third button is selected ([0231], user selects from an OK button, cancel
button, help button, new voice button, etc.);

a cancellation step of canceling the help mode of the apparatus, in a case where
it is determined in said state determination step that the state of the apparatus is the
help mode and it is determined in said button determination step that the help button is
selected ([0231], user selects from an OK button, cancel button, help button, new voice
button, etc., canceling help mode will revert back to normal mode);

an output-storage step of phonetically outputting a description of a motion
corresponding to the third button and storing the motion corresponding to the third
button ([0231], user selects from an OK button, cancel button, help button, new voice
button, etc.) in a storage device, in a case where it is determined in said state determination
step that the state of the apparatus is the help mode and it is determined in said button
determination step that the third button is selected ([0231], user selects from a multiplicity
of buttons such as an OK button, cancel button, help button, new voice button, etc.); and

a second execution step of executing the motion stored in the storage device, in
a case where it is determined in said state determination step that the state of the
apparatus is the help mode and it is determined in said button determination step that
the execution button is selected ([0231], user selects from a multiplicity of buttons such as
an OK button, cancel button, help button, new voice button, etc., wherein OK is merely
an execution action/motion).
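
For illustration only, the control flow recited in the steps above may be sketched as a two-state machine. The sketch is not taken from Shizuka, King, or the application; the class name HelpModeController, the button labels "help", "execute", and "third", and the speak callback are hypothetical names chosen for the example, and the claim is silent on details such as what happens when the execution button is selected in help mode before any motion has been stored.

```python
from typing import Callable, Dict, Optional

NORMAL, HELP = "normal", "help"


class HelpModeController:
    """Hypothetical sketch of the claimed help-mode / normal-mode control flow."""

    def __init__(
        self,
        motions: Dict[str, Callable[[], None]],   # button -> motion to execute
        descriptions: Dict[str, str],             # button -> spoken description
        speak: Callable[[str], None],             # stand-in for phonetic output
    ) -> None:
        self.state = NORMAL
        self.motions = motions
        self.descriptions = descriptions
        self.speak = speak
        self.stored: Optional[Callable[[], None]] = None  # the "storage device"

    def press(self, button: str) -> None:
        # Button determination: the help button toggles the mode.
        if button == "help":
            # Setting step (normal -> help) or cancellation step (help -> normal).
            self.state = HELP if self.state == NORMAL else NORMAL
            return

        # State determination: normal mode executes the motion at once.
        if self.state == NORMAL:
            self.motions[button]()          # first execution step
            return

        # Help mode.
        if button == "execute":
            # Second execution step: run the stored motion, if any.
            # Doing nothing when no motion is stored is an assumption.
            if self.stored is not None:
                self.stored()
        else:
            # Output-storage step: describe the motion aloud, then store it.
            self.speak(self.descriptions[button])
            self.stored = self.motions[button]
```

In this sketch, selecting the third button in help mode only describes and stores its motion; the motion runs when the execution button is subsequently selected.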

However, Shizuka fails to teach phonetically outputting a description of a motion.

King teaches an assistive technology application 212 that produces speech
information corresponding to the screen image information. In the embodiment of FIG.
2, the speech information conveys human speech which verbally describes general
attributes (e.g., color, shape, size, and the like) of the screen image and any objects
(e.g., menus, dialog boxes, icons, text, and the like) within the screen image, and also
includes semantic information conveying the meaning, significance, or intended purpose
of each of the objects within the screen image. The speech information may include, for
example, text-to-speech (TTS) commands and/or audio output signals. Suitable
assistive technology applications are known and commercially available. The
assistive technology application 212 provides the speech information to a speech
application program interface (API) 214. The speech application program interface
(API) 214 provides a standard means of accessing routines and services within an
operating system of the server 102 (King Col. 5 lines 45-65 & Fig. 2).



Application/Control Number: 10/799,645 Page 8 

Art Unit: 2626 

Further, King teaches that the console access application 202 of the client 104A 
are configured to cooperate such that the user of the client 104A is able to interact with 
the server 102 as if the user were operating the server 102 locally. As shown in FIG. 2, 
the client 104A includes an input device 220. The input device 220 may be for example, 
a keyboard, a mouse, or a voice recognition system. When the user of the client 104A 
activates the input device 220 (e.g., presses a keyboard key, moves a mouse, or 
activates a mouse button), the input device 220 produces one or more input signals 
(i.e., "input signals"), and provides the input signals to the distributed console access 
application 202. The distributed console access application 202 transmits the input 
signals to the distributed console access application 200 of the server 102. (King Col. 6 
lines 41-56). 

Furthermore, King teaches that when the user of the client 104A is visually
impaired, the user may not be able to see the screen image displayed on the display
screen 210 of the client 104A. However, when the audio output device 230 produces
the verbal description of the screen image, the visually-impaired user may hear the
description, and understand not only the general appearance of the screen image and
any objects within the screen image (e.g., color, shape, size, and the like), but also the
meaning, significance, or intended purpose of any objects within the screen image as
well (e.g., menus, dialog boxes, icons, and the like). This ability for a visually-impaired
user to hear the verbal description of the screen image and to know the meaning,
significance, or intended purpose of any objects within the screen image allows the user
of the client 104A to interact with the objects in a proper, meaningful, and expected way.
(King Col. 7 lines 49-64).

Therefore, it would have been obvious to one of ordinary skill in the art at the
time of the invention to modify the system of Shizuka to incorporate phonetically
outputting a description of a motion, as taught by King, to allow for a system that can
detect state changes and output the changes verbally to a user. In such a system, a
description is transmitted to notify the user of what is happening in a speech synthesis
system through a verbal description that is clear enough that someone who is visually
impaired can function as effectively as someone without impairment, and an audio help
mode state is indicated as turned off or on depending on the mode selected by the user,
so that the user will know whether or not the apparatus is in help mode (King Col. 7
lines 49-64).

Re claim 6, Shizuka teaches the method according to claim 1, further comprising
a termination step of terminating audio output being currently outputted in a case where
operation performed on the apparatus is detected (Fig. 34 items S48 and S49).

Re claim 7, Shizuka teaches the method according to claim 1, further comprising
a second audio output step of phonetically outputting a motion result in said second
execution step ([0305] - [0306]).

Re claim 8, Shizuka teaches the method according to claim 1, further comprising:

an acquisition step of acquiring a name of a motion ([0231], user selects from a
multiplicity of buttons such as an OK button, cancel button, help button, new voice button,
etc., wherein OK is merely an execution action/motion) corresponding to the third
button, in a case where it is determined in said state determination step that the state of
the apparatus is the help mode and it is determined in said button determination step
that the third button is selected ([0225]); and

However, Shizuka fails to teach a second audio output step of phonetically outputting
the name before phonetically outputting the description of the motion in said
output-storage step (King Col. 7 lines 49-64).

King teaches an assistive technology application 212 that produces speech 
information corresponding to the screen image information. In the embodiment of FIG. 
2, the speech information conveys human speech which verbally describes general 
attributes (e.g., color, shape, size, and the like) of the screen image and any objects 
(e.g., menus, dialog boxes, icons, text, and the like) within the screen image, and also 
includes semantic information conveying the meaning, significance, or intended purpose 
of each of the objects within the screen image. The speech information may include, for 
example, text-to-speech (TTS) commands and/or audio output signals. Suitable 
assistive technology applications are known and commercially available. The
assistive technology application 212 provides the speech information to a speech
application program interface (API) 214. The speech application program interface
(API) 214 provides a standard means of accessing routines and services within an
operating system of the server 102 (King Col. 5 lines 45-65 & Fig. 2).

Further, King teaches that the console access application 202 of the client 104A
are configured to cooperate such that the user of the client 104A is able to interact with 
the server 102 as if the user were operating the server 102 locally. As shown in FIG. 2, 
the client 104A includes an input device 220. The input device 220 may be for example, 
a keyboard, a mouse, or a voice recognition system. When the user of the client 104A 
activates the input device 220 (e.g., presses a keyboard key, moves a mouse, or 
activates a mouse button), the input device 220 produces one or more input signals 
(i.e., "input signals"), and provides the input signals to the distributed console access 
application 202. The distributed console access application 202 transmits the input 
signals to the distributed console access application 200 of the server 102. (King Col. 6 
lines 41-56). 

Furthermore, King teaches that when the user of the client 104A is visually
impaired, the user may not be able to see the screen image displayed on the display
screen 210 of the client 104A. However, when the audio output device 230 produces
the verbal description of the screen image, the visually-impaired user may hear the
description, and understand not only the general appearance of the screen image and
any objects within the screen image (e.g., color, shape, size, and the like), but also the
meaning, significance, or intended purpose of any objects within the screen image as
well (e.g., menus, dialog boxes, icons, and the like). This ability for a visually-impaired
user to hear the verbal description of the screen image and to know the meaning,
significance, or intended purpose of any objects within the screen image allows the user
of the client 104A to interact with the objects in a proper, meaningful, and expected way.
(King Col. 7 lines 49-64).

Therefore, it would have been obvious to one of ordinary skill in the art at the
time of the invention to modify the system of Shizuka to incorporate outputting a
description of the motion corresponding to the operation in a case where the state of the
apparatus is the help mode and a description of the motion corresponding to the
operation in a case where it is determined in said execution determination step that the
operation detected in said operation detection step does not designate the execution of
motion as taught by King to allow for a system that can detect state changes and output
the changes verbally to a user, wherein a description is transmitted to notify a user what
is happening in a speech synthesis system through a verbal description that is clear
enough where someone who is visually impaired can function as effectively as someone
without impairment, where an audio help mode state is indicated to be turned off or on
depending on the mode selected by a user where he/she will know if they are in help
mode or not (King Col. 7 lines 49-64).

5. Claims 9-12 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Shizuka et al. US 20020184004 A1 (hereinafter Shizuka) in view of King et al.
US 7103551 B2 (hereinafter King) and further in view of Surace et al. US 6334103 B1
(hereinafter Surace).

Re claim 9, Shizuka teaches the method according to claim 1, further comprising:
a changing step of changing sound quality of output speech (Fig. 24).

However, Shizuka in view of King fails to teach a determination step of
determining whether or not one same operation has been repeatedly performed on the
apparatus (Surace Col. 10 lines 22-30), and changing the sound quality from that of
the speech outputted last, in a case where it is determined in said
determination step that one same operation has been repeatedly performed (Surace Col. 10
lines 22-30).

Surace teaches a voice user interface with personality, wherein it is determined 
whether the user is requiring repeated help in the same session or across sessions (i.e., 
a user is requiring help more than once in the current session). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify the system of Shizuka in view of King to incorporate 
changing sound quality after determining whether or not the same operation has been 
repeatedly performed as taught by Surace to allow for a voice quality adjustment to be 
implemented such as a personality of a user interface dependent on how many times an 
operation is repeated (based on social and psychological experimental data) (Surace 
Col. 10 lines 22-30). 

Re claim 10, Shizuka teaches the method according to claim 9, wherein in said
changing step, vocalize speed of the output speech is changed (Fig. 24).

Re claim 11, Shizuka in view of King fails to teach the method according to claim
9, wherein in said changing step, volume of the output speech is changed (Surace Col.
22 lines 44-49).

Surace teaches the editing of audio tapes of the recorded scripts (e.g., to adjust 
volume and ensure smooth audio transitions within dialogs). 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of the invention to modify the system of Shizuka in view of King to incorporate 
changing the volume of the output speech as taught by Surace to allow for a voice 
quality adjustment to be implemented such as a personality of a user interface 
dependent on how many times an operation is repeated (based on social and 
psychological experimental data). Various parameters such as pitch, speed, clarity, and 
intonation can be varied to alter the personality of a voice interface (Surace Col. 22 lines 
44-49). 

Re claim 12, Shizuka teaches the method according to claim 9, wherein in said 
changing step, vocal quality of the output speech is changed (Fig. 24). 
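
For illustration only, the determination and changing steps addressed in claims 9-12 may be sketched as follows. The sketch is not taken from Shizuka, King, Surace, or the application; the names VoiceSettings and RepeatAwareVoice, the choice to slow the speech rate by 0.1 per repetition, and the lower bound of 0.5 are all assumptions made for the example. Volume or vocal quality could be varied in the same manner.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical speech parameters; 1.0 means the default value."""
    rate: float = 1.0
    volume: float = 1.0


class RepeatAwareVoice:
    """Changes the sound quality when the same operation is repeated."""

    def __init__(self) -> None:
        self.settings = VoiceSettings()
        self.last_operation: Optional[str] = None

    def on_operation(self, operation: str) -> VoiceSettings:
        # Determination step: was this same operation performed last time?
        if operation == self.last_operation:
            # Changing step: alter the quality relative to the speech
            # outputted last (here, slow down; the step size is an assumption).
            slower = max(0.5, round(self.settings.rate - 0.1, 2))
            self.settings = replace(self.settings, rate=slower)
        self.last_operation = operation
        return self.settings
```

A different operation leaves the settings unchanged in this sketch; whether the settings should instead revert to their defaults is not addressed by the claim language quoted above.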

Conclusion 

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Michael C. Colucci whose telephone number is
(571)-270-1847. The examiner can normally be reached on 9:30 am - 6:00 pm,
Monday-Friday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Richemond Dorvil can be reached on (571)-272-7602. The fax phone
number for the organization where this application or proceeding is assigned is
571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



/Michael C. Colucci/
Examiner, Art Unit 2626
Patent Examiner
AU 2626
(571)-270-1847
Michael.Colucci@uspto.gov



/Richemond Dorvil/ 

Supervisory Patent Examiner, Art Unit 2626 



